How Hard is Computing Parity with Noisy Communications?

Chinmoy Dutta, Yashodhan Kanoria, D. Manjunath, Jaikumar Radhakrishnan


Abstract 

We show a tight lower bound of $\Omega(N \log\log N)$ on the number of transmissions required to compute
the parity of N input bits with constant error in a noisy communication network of N randomly
placed sensors, each having one input bit and communicating with others using local transmissions with
power near the connectivity threshold. This result settles the lower bound question left open by Ying,
Srikant and Dullerud (WiOpt 06), who showed how the sum of all the N bits can be computed using
$O(N \log\log N)$ transmissions. The same lower bound has been shown to hold for a host of other
functions including majority by Dutta and Radhakrishnan (FOCS 2008).

Most work on lower bounds for communication networks has considered the full broadcast
model, without using the fact that communication in real networks is local, determined by the power
of the transmitters. In fact, in full broadcast networks computing parity needs $\Theta(N)$ transmissions. To
obtain our lower bound we employ techniques developed by Goyal, Kindler and Saks (FOCS 05), who 
showed lower bounds in the full broadcast model by reducing the problem to a model of noisy decision 
trees. However, in order to capture the limited range of transmissions in real sensor networks, we adapt 
their definition of noisy decision trees and allow each node of the tree access to only a limited part of the 
input. Our lower bound is obtained by exploiting special properties of parity computations in such noisy 
decision trees. 


1 Introduction 

Since inexpensive wireless technology and sensing hardware have become widely available and are heavily 
used, much recent effort has been devoted to developing models for these networks and protocols based 
on these models. A wireless sensor network consists of sensors that collect and cooperatively process data 
in order to compute some global function. The sensors interact with each other by transmitting wireless 
messages based on some protocol. The protocol is required to tolerate errors in transmissions since wireless 
messages typically are noisy. 

In the problem we study, each sensor is required to detect a bit; then, all the sensors are required to 
collectively compute the parity of these bits. The difficulty of this task, of course, depends on the noise and 
the connectivity of the network. In this paper, we assume that each bit sent is flipped (independently for each
receiver) with probability $\epsilon > 0$ during transmission. As for connectivity, we adopt the widely used model
of random planar networks. Here the sensors are placed randomly and uniformly in a unit square. Then 
each transmission is assumed to be received (with noise) by the sensors that are within some prescribed 
radius of the sender. The radius is determined by the amount of power used by the sensors, and naturally 
one wishes to keep the power used as low as possible, perhaps just enough to ensure that the entire network 
is connected. If the network is not connected then it cannot be expected to compute a function like parity
which depends on all the input bits. It has been shown by Gupta and Kumar [7] that the threshold radius for
connectivity is $\Theta\left(\sqrt{\ln N / N}\right)$ for a random planar network of N sensors placed in a unit square. With a radius
much smaller than this the network will almost surely not be connected, and with a radius much larger it will
almost surely be connected.

Our work is motivated by a protocol presented by Ying, Srikant and Dullerud (WiOpt 06) for computing the
sum of all the bits (and hence any symmetric function of these bits). They showed that even with radius of
transmission just near the connectivity threshold, and constant noise probability, one can compute the sum
using a total of $O(N \log\log N)$ transmissions. They observed the (trivial) lower bound of N transmissions
(for every sensor must send at least one message), but left open the possibility of better upper bounds. One 
can compute the parity of the input bits from their sum; in fact, Ying et al. suggested that parity computation 
might be significantly easier than computing the sum. In this work, we prove a lower bound showing that the 
protocol of Ying et al. is optimal up to constant factors for computing the parity (and hence, also the sum) 
of the input bits. In order to state our result formally we need to define the model of noisy communication
networks.

Definition 1 (Noisy communication network and protocol). A communication network is an undirected
graph G whose vertices correspond to sensors and edges correspond to communication links. A message
sent by a sensor is received by all its neighbors.

Noise: In an $\epsilon$-noise network, the messages are subjected to noise as follows. Suppose sensor v sends bit b
in time step t. Each neighbor of v then receives an independent noisy version of b; that is, the neighbor
w of v receives the bit $b \oplus \eta_{w,t}$, where $\eta_{w,t}$ is an $\epsilon$-noisy bit (that takes the value 1 with probability $\epsilon$
and 0 with probability $1 - \epsilon$), these noisy bits being mutually independent for different neighbors.

Input: An input to the network is an assignment of bits to the sensors, and is formally an element of
$\{0,1\}^{V(G)}$.

Protocol: A protocol on G for computing a function $f : \{0,1\}^{V(G)} \to \{0,1\}$ works as follows. The sensors
take turns to send single bit messages, which are received only by the neighbors of the sender. In the
end, a designated sensor $v^* \in V(G)$ declares the answer. The cost of the protocol is the total number
of bits transmitted. A message sent by a sensor in some time step is a function of the bits that it
possesses, which include its input bit and the noisy copies of the bits transmitted by its neighbors until
then. The protocol with cost T is thus specified by a sequence of T vertices $(v_1, v_2, \ldots, v_T)$ and
a sequence of T functions $(g_1, g_2, \ldots, g_T)$, where $g_t : \{0,1\}^{j_t} \to \{0,1\}$ and $j_t$ is the number of
bits possessed by $v_t$ before time step t. Furthermore, $v_T = v^*$, and the final answer is obtained by
computing $g_T$. Note that in our model the number of transmissions is the same for all inputs.

Error: Such a protocol is said to be a $\delta$-error protocol if, for all inputs $x \in \{0,1\}^{V(G)}$,
$\Pr[\text{output} = f(x)] \ge 1 - \delta$. Here the probability is over the noise in the communication channel as well as the
internal randomness, if any, of the protocol.
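The noise model of Definition 1 can be illustrated with a small simulation. The following is a minimal sketch and not part of the analysis: the function name `noisy_broadcast`, the dictionary used to hold the received bits, and the use of Python's `random.Random` as the source of noise are all assumptions made for illustration.

```python
import random


def noisy_broadcast(bit, neighbors, eps, rng):
    """Simulate one transmission in an eps-noise network (illustrative helper).

    Each neighbor w receives bit XOR eta_w, where eta_w is 1 with
    probability eps and 0 otherwise, drawn independently per neighbor.
    """
    received = {}
    for w in neighbors:
        noise = 1 if rng.random() < eps else 0  # independent eps-noisy bit
        received[w] = bit ^ noise
    return received


# Example: sensor v sends the bit 1 to three neighbors with eps = 0.1.
rng = random.Random(2024)
copies = noisy_broadcast(1, ["w1", "w2", "w3"], 0.1, rng)
```

Because the noise is drawn separately for each receiver, two neighbors of the same sender may disagree about the transmitted bit, which is the source of the "reception diversity" that upper-bound protocols exploit.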

In this paper, we consider networks that arise out of random placement of sensors in the unit square. 

Definition 2 (Random planar network). A random planar network $\mathcal{N}(N, R)$ is a random variable
whose values are undirected graphs. The distribution of the random variable depends on two parameters:
N, the number of vertices, and R, the transmission radius. The vertex set of $\mathcal{N}(N, R)$ is
$V(N) = \{P_1, P_2, \ldots, P_N\}$. The edges are determined as follows. First, these vertices are independently placed at
random, uniformly in the unit square $[0,1]^2$. Then,


$$E(N) = \{(P_i, P_j) : \mathrm{dist}(P_i, P_j) \le R\},$$


where $\mathrm{dist}(P_i, P_j)$ is the Euclidean distance between vertices $P_i$ and $P_j$.
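A sample from the distribution in Definition 2 can be generated directly. The sketch below is only an illustration: the function name `random_planar_network`, the edge representation as index pairs, and the quadratic pairwise loop are assumptions chosen for brevity, and an edge is placed when the distance is at most the radius.

```python
import math
import random


def random_planar_network(n, radius, rng):
    """Sample n points uniformly in the unit square and connect close pairs.

    Returns the list of points and the set of edges (i, j) with i < j
    whose Euclidean distance is at most `radius`.
    """
    points = [(rng.random(), rng.random()) for _ in range(n)]
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= radius:
                edges.add((i, j))
    return points, edges


# Example: 100 sensors with radius near the connectivity scale sqrt(ln N / N).
rng = random.Random(7)
n_sensors = 100
radius = math.sqrt(math.log(n_sensors) / n_sensors)
points, edges = random_planar_network(n_sensors, radius, rng)
```

The pairwise loop costs on the order of $N^2$ distance computations; a grid of cells of side R would reduce this, but the simple version keeps the correspondence with the definition obvious.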

The result in this paper is the following. 

Theorem 3 (Lower bound for parity). Let $R \le N^{-\beta}$ for some $\beta > 0$. Let $\delta < \frac{1}{2}$ and $\epsilon \in (0,1)$.
Then, with probability $1 - o(1)$ over the random variable $\mathcal{N}(N,R)$, every $\delta$-error protocol on $\mathcal{N}(N,R)$
with $\epsilon$-noise for computing the parity function $\oplus : \{0,1\}^N \to \{0,1\}$ requires $\Omega(N \log\log N)$
transmissions.

Remark 4. Our definition of noise assumes that all transmissions are subjected to noise with probability
exactly $\epsilon$. In the literature, other models of error have been considered. Some protocols work even in the
weaker model where this probability is at most $\epsilon$. Our lower bound holds for the stronger model with the
noise parameter being exactly $\epsilon$, and hence is also applicable to the weaker model.

Remark 5. We require only an upper bound on the transmission radius. However, the result is meaningful
only when $R = \Omega\left(\sqrt{\ln N / N}\right)$, for otherwise, with high probability, the network is not connected and cannot
be expected to compute any function that depends on all its input bits.

Remark 6. Trivially, this lower bound also holds for computing the sum of the input bits. 

1.1 Related work 

The most commonly studied noisy communication model allows full broadcasts, that is, all sensors receive 
all messages (with independent noise). In this model, Gallager [5] considered the problem of collecting all
the bits at one sensor, and showed how this could be done using $O(N \log\log N)$ transmissions; this implies
the same upper bound for computing any function of the input bits. More recently, in a remarkable result,
Goyal, Kindler and Saks [6] showed that Gallager's protocol was the best possible for collecting all the bits.
However, they do not present any boolean function for which $\Omega(N \log\log N)$ transmissions are required.

In the full broadcast model, protocols for computing specific functions have also been studied in the 
literature. Feige and Raghavan [4] presented a protocol with $O(N \log^* N)$ transmissions for computing the
OR of N bits; this result was improved by Newman [11], who gave a protocol with $O(N)$ transmissions. For
computing threshold functions Kushilevitz and Mansour [10] showed a protocol with $O(N)$ transmissions,
assuming that all messages are subject to noise with probability exactly $\epsilon$. Under the same assumption,
Goyal, Kindler and Saks [6] showed that the sum of all the bits (and hence all symmetric functions) could
be computed with $O(N)$ transmissions.

In this paper we are concerned with networks arising from random placement of sensors, where considerations
of power impose stringent limits on the transmission radius. In this model, Ying, Srikant and
Dullerud presented a protocol for computing the sum of all the bits as mentioned above. Kanoria and
Manjunath [9] gave a protocol that uses $O(N)$ transmissions to compute the OR function. However, no
non-trivial lower bounds that apply specifically to communication networks with limited transmission radius
had appeared in the literature before this work. Subsequent to the initial presentation of this work [?], Dutta
and Radhakrishnan (FOCS 2008) showed that the same lower bound of $\Omega(N \log\log N)$ holds for computing a host of
boolean functions including the majority function.
1.2 Techniques


We now present an overview of the proof technique used to derive our lower bound. As we explain in more 
detail in Section ??, the proof has two parts. The first part is geometric. Since the transmission radius
is limited, it is possible to decompose the nodes of the communication network into clusters. The nodes in 
the interior of each cluster will continue to receive inputs and will be called input nodes, but those on the 
boundary will have their inputs fixed (arbitrarily) and thereby become auxiliary nodes that still participate 
in the protocol by sending and receiving messages. This decomposition of the communication network into 
clusters ensures that any node can receive transmissions from input nodes of at most one cluster. This allows 
us to view the protocol as a combination of several subprotocols acting on different clusters and interacting 
with each other via the auxiliary nodes. This graph-theoretic decomposition is based on routine arguments
involving the distribution of points chosen independently and uniformly at random on the unit square.

The second part of the proof is combinatorial and concerns arguing that the subprotocols acting on 
different clusters of the decomposed network can be assumed to be independent of each other. This part is 
not straightforward and we need to revisit the arguments used by Goyal, Kindler and Saks [6] to obtain their
lower bounds. A key insight in their proof was that protocols in noisy communication networks could be
translated into what they called generalized noisy decision trees (gnd trees). We adapt their argument to our
setting. For us it is important to ensure that the decomposition of the network (which was the consequence 
of the limited transmission radius) is reflected in the noisy decision trees we construct. So, we define a 
notion of noisy decision trees appropriate for our setting, where we allow each node of the tree access to the 
inputs of only one cluster. We show how efficient protocols on decomposed networks can be translated to 
such decision trees of small depth. 

The argument this far was general and did not use the fact that the ultimate goal of the protocol is to 
compute the parity function. Next we show that we can rearrange the decision tree so that the queries made 
to the variables in the same cluster of the decomposition appear at adjacent levels of the tree. This part 
crucially depends on the fact that we are trying to compute the parity function. After the rearrangement, we 
can view the entire computation as a sequence of noisy decision tree computations, one for each cluster. We 
conclude that in order to have low overall error, the computation in each cluster must have vanishingly small 
error probability. At this stage we can directly apply a result of Goyal, Kindler and Saks [6], which states
that any decision tree that computes the parity function with error $o(1)$ must have superlinear depth. This
dependence of depth on error is strong enough to yield our lower bound. 

The interesting feature of this argument is that we work with appropriately defined decision trees instead 
of directly with the decomposed protocol. Once inputs of processors have been fixed, they become auxiliary. 
However, they continue to participate in the protocol. In particular, they receive transmissions from processors
with inputs and can potentially aid error correction by providing additional reception diversity, which
is crucially exploited in many of the upper bounds. So it is not true that our decomposition immediately
breaks the protocol into independent subprotocols, operating separately on different clusters. Nevertheless,
when we translate the decomposed protocol into our model of decision trees, we can view the computation
of the entire decision tree as a combination of independent decision subtrees, operating separately on different
clusters. This provides us the required product property, from which one easily deduces that each
individual subtree must compute the parity within its cluster very accurately. For a detailed discussion of
this technique as well as those developed to analyze functions where we do not have the product property,
we refer the reader to the PhD thesis [1].
1.3 Organization of the paper

Section 2 presents some definitions and notation. In Section 3 we state two lemmas corresponding to the
two parts of the argument, and derive the lower bound for parity. The details of the first part of the argument
are presented in Section 4. The second part of the argument is spread over Sections 5 and 6. We conclude
the paper in Section 7.

2 Preliminaries 

In our proof, some of the nodes in the network will receive no input. We now introduce the terminology 
applicable in such situations. 

Definition 7 (Input and auxiliary nodes). Let $G = (V, E)$ be a communication network. We partition
the set of nodes, V, into the set of input nodes, I, and the set of auxiliary nodes, A. Nodes in I receive inputs
and those in A do not receive any input but have their input bits fixed arbitrarily. An input to such a network
is an element of $\{0,1\}^I$ and a protocol on such a network computes a function $f : \{0,1\}^I \to \{0,1\}$.

Next we formalize the notions of network decomposition and bounded protocols on such decomposed 
networks. 

Definition 8 (Network decomposition and bounded protocols). Let $G = (I \cup A, E)$ be a communication
network. An (n, k)-decomposition of G is a partition of the set of nodes of G of the form $I = I_1 \cup \cdots \cup I_k$
and $A = A_0 \cup A_1 \cup \cdots \cup A_k$ such that for $j = 1, \ldots, k$,

(P1) $|I_j| = n$, and

(P2) the neighborhood of $I_j$ is contained in $I_j \cup A_j$.

A protocol $\Pi$ on G is said to be a (d, D)-bounded protocol with respect to the decomposition $(A_0, (I_j, A_j) : j = 1, \ldots, k)$
if for $j = 1, \ldots, k$,

(P3) a node in $I_j$ makes at most d transmissions, and

(P4) all nodes in $I_j \cup A_j$ put together make at most D transmissions.

We use the notation $\epsilon$-noise (n, k, d, D)-protocol to mean a (d, D)-bounded protocol for some (n, k)-decomposed
network with noise parameter $\epsilon$.

As stated earlier, we will use the method of Goyal, Kindler and Saks [6] to translate a communication
protocol into a noisy decision tree. We now present the terminology for noisy decision trees. 

Definition 9 (Decision tree). Let S be an arbitrary set and k be a positive integer. A decision tree T
for the set of inputs $S^k$ is a balanced tree where each internal node v is labelled by a pair $(i_v, g_v)$ where
$i_v \in [k]$, $g_v : S \to C_v$, and $C_v$ is the set of children of v. We call the tree a noisy decision tree if the
functions $g_v$ are noisy. A noisy function is one whose output depends on its input as well as some internal
randomness. Such a tree T computes a function from $S^k$ to the set $L(T)$ of leaves of T as follows: on
input $(x_1, x_2, \ldots, x_k) \in S^k$, the computation starts at the root and determines the next vertex to visit after a
vertex v by evaluating $g_v(x_{i_v})$; the leaf reached in the end is the result of the computation. If $i_v = i$
for a vertex v, then we say that the i-th input variable is queried at that vertex. We say that the decision
tree is oblivious if the label $i_v$ of a vertex v depends only on the level of v (distance from the root). We say
that an oblivious decision tree is ordered if for all $j \in [k]$ all queries to the j-th input variable appear
at consecutive levels. We say that an oblivious decision tree is read-once if each input variable is queried
exactly once.

Remark 10. We use the notation (n, k)-decision tree to refer to a decision tree for inputs in $S^k$ where
$S = \{0,1\}^n$.

Remark 11. A read-once decision tree is obviously ordered. Also, an ordered decision tree can be easily 
made read-once by collapsing consecutive queries to the same variable into one supernode. 
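The collapsing step in Remark 11 can be made concrete for an oblivious tree, where the queried variable depends only on the level. The sketch below is an illustration under that assumption: it represents a tree only by its list of per-level query labels, and the names `is_ordered` and `collapse_to_read_once` are hypothetical helpers, not terminology from the paper.

```python
from itertools import groupby


def is_ordered(query_labels):
    """Return True if all queries to each variable occupy consecutive levels."""
    blocks = [label for label, _ in groupby(query_labels)]
    # Ordered means no variable starts a second, separate run of levels.
    return len(blocks) == len(set(blocks))


def collapse_to_read_once(query_labels):
    """Merge each run of consecutive queries to one variable into a supernode.

    Returns a list of (variable, number_of_merged_levels) pairs, one per
    supernode, in the order in which the variables are queried.
    """
    if not is_ordered(query_labels):
        raise ValueError("queries to some variable are not consecutive")
    return [(label, sum(1 for _ in run)) for label, run in groupby(query_labels)]


# Example: variable 1 is queried at two levels, variable 2 at three, variable 3 at one.
supernodes = collapse_to_read_once([1, 1, 2, 2, 2, 3])
```

Each supernode stands for the composition of the functions on the merged levels, so the collapsed tree queries every variable that appears exactly once.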

As in [6], in order to capture the noise in a noisy communication network, we define a special kind
of noisy decision tree, the xored-noise decision tree (xnd tree). Here we allow each of the functions $g_v$
access to its input variable xored with some noise variable. These noise variables are set according to some
distribution based on a noise parameter $\epsilon$, but independent of the input.

Definition 12 (xnd tree). An $(n, k, D, \epsilon)$-xnd tree T is an (n, k)-noisy decision tree. It consists of an
oblivious decision tree T on inputs $S^k$ where $S = \{0,1\}^n \times (\{0,1\}^n)^{\Lambda}$ (for some index set $\Lambda$), and each
function $g_v$ has a special form:

$$g_v(x_{i_v}, z_{i_v}) = g'_v(x_{i_v} \oplus z_{i_v,\lambda_v})$$

for some $g'_v : \{0,1\}^n \to C_v$ and $\lambda_v \in \Lambda$. Each input is queried at most D times in the tree. The computation
of T proceeds as follows: on input $x \in (\{0,1\}^n)^k$, each $z_{i,\lambda} \in \{0,1\}^n$ is chosen independently according
to the binomial distribution $B(n, \epsilon)$. Once the entire input $(x, z) \in S^k$ is determined, we compute $T(x, z)$
as in Definition 9 above.

Remark 13. When k = 1, the trees defined in the above definition correspond to the gnd trees of Goyal,
Kindler and Saks [6].

Let A be an algorithm to process inputs from some set S. The usefulness of A to compute some boolean 
function / on input set S is captured by the notion of its advantage. 

Definition 14 (Advantage). Let $\mu$ be a distribution on some set S. Let $f : S \to \{+1,-1\}$ and $A : S \to C$,
where C is some set. Then, the advantage of A for f under $\mu$ is given by

$$\mathrm{adv}_{f,\mu}(A) = \max_{a : C \to \{+1,-1\}} \left| E[f(X)\, a(A(X))] \right|,$$

where X is a random variable taking values in S with distribution $\mu$. We will use this notation even when A
corresponds to a randomized algorithm, in which case the expectation is computed over X as well as the
internal random choices made by A.
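For a deterministic A and a distribution with finite support, the advantage can be computed directly. The sketch below assumes the maximum in Definition 14 ranges over functions a from C to {+1, -1}; under that assumption the best a picks, for each value c, the sign of the weighted sum of f over the inputs that A maps to c. The function name `advantage` and the dictionary representation of the distribution are illustrative choices.

```python
from collections import defaultdict
from itertools import product


def advantage(mu, f, A):
    """Advantage of a deterministic A for f under a finite distribution mu.

    mu maps each input x to its probability, f maps x to +1 or -1, and A maps
    x to any hashable value. For each output c of A, the optimal relabelling
    a(c) is the sign of sum(mu[x] * f(x) for x with A(x) == c), so the
    advantage is the sum of the absolute values of these per-output sums.
    """
    weight = defaultdict(float)
    for x, p in mu.items():
        weight[A(x)] += p * f(x)
    return sum(abs(w) for w in weight.values())


def parity_sign(x):
    """Parity of a bit tuple in +1/-1 form (+1 for even weight)."""
    return -1 if sum(x) % 2 else 1


# Example: uniform distribution on {0,1}^3.
uniform = {x: 1 / 8 for x in product((0, 1), repeat=3)}
full_view = advantage(uniform, parity_sign, lambda x: x)          # A reveals all bits
partial_view = advantage(uniform, parity_sign, lambda x: x[:2])   # A hides one bit
```

In this example `full_view` is 1.0 and `partial_view` is 0.0: under the uniform distribution, an algorithm that misses even one input bit has no advantage for parity.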

Definition 15. For a distribution $\mu$ on $\{0,1\}^n$, let

$$\alpha_\mu(n, D, \epsilon) = \max_T \mathrm{adv}_{\oplus,\mu}(T),$$
where T ranges over all $(n, 1, D, \epsilon)$-xnd trees.

3 Lower bound for parity 

Our lower bound proof has two parts. In this section, we will summarize the results of these two parts of the 
argument in the form of lemmas. Then, using these lemmas we will prove the main theorem. The lemmas 
themselves will be proved in the next three sections. 
3.1 First part of the proof

This part of our argument is based on the observation that in a random planar network, nodes are typically 
distributed uniformly over the entire area. By fixing the inputs of some of the nodes (and thereby making 
them auxiliary), we can create ‘buffer zones’ of auxiliary nodes so that the remaining nodes now fall into
a large number of well-separated large clusters.

Lemma 16. Suppose $R \le N^{-\beta}$, for some $\beta > 0$. Then, with probability $1 - o(1)$ over the random variable
$\mathcal{N}(N, R)$, the following holds: if

there is a $\delta$-error protocol on $\mathcal{N}$ with $\epsilon$-noise for computing the parity function (on N bits) with
T transmissions,

then

there is an (n, k)-decomposition of $\mathcal{N}$ and a $\delta$-error $\epsilon$-noise (n, k, d, D)-protocol with respect
to this decomposition for computing parity (on nk bits), where $n = \Omega(NR^2)$, $k = \Omega(1/R^2)$,
$d = O(T/N)$ and $D = O(TR^2)$.

This lemma is proved in Section ??. 

3.2 Second part of the proof

In the second part of our argument, we analyze such bounded protocols on decomposed networks. Our 
analysis closely follows that of Goyal, Kindler and Saks [6]. For showing lower bounds on the number of
transmissions in a noisy communication protocol, Goyal et al. translated such protocols into gnd trees. 

Since we want to analyse bounded protocols for decomposed networks, we first translate such protocols
into xnd trees. Then we argue that if the inputs come from a product distribution, then xnd trees for computing
parity can be rearranged to get ordered xnd trees, and hence read-once noisy decision trees (using
Remark 11).

Lemma 17 (Translation from protocols to read-once decision trees). For any $\epsilon$-noise (n, k, d, D)-protocol
$\Pi$ and any distribution $\mu$ on $\{0,1\}^n$, there is a read-once noisy (n, k)-decision tree T such that

• $\mathrm{adv}_{\oplus,\mu^k}(T) \ge \mathrm{adv}_{\oplus,\mu^k}(\Pi)$;

• $\mathrm{adv}_{\oplus,\mu}(g) \le \alpha_\mu(n, 3D, \epsilon^d)$ for every function g that appears in T.

Next we observe the following ’product property’ for the advantage of read-once noisy decision trees. 

Lemma 18 (Advantage of read-once decision trees). Let $h : \{0,1\}^n \to \{+1,-1\}$. Suppose T is a
read-once (n, k)-decision tree for computing $f : (\{0,1\}^n)^k \to \{+1,-1\}$ defined by

$$f((x_1, x_2, \ldots, x_k)) = \prod_{i=1}^{k} h(x_i).$$

Suppose, for each function g that appears in T, we have $\mathrm{adv}_{h,\mu}(g) \le \alpha$. Then, $\mathrm{adv}_{f,\mu^k}(T) \le \alpha^k$.

The above two lemmas give the main lemma of the second part of our proof. 

Lemma 19. For all distributions $\mu$ on $\{0,1\}^n$ and all $\epsilon$-noise (n, k, d, D)-protocols $\Pi$, we have

$$\mathrm{adv}_{\oplus,\mu^k}(\Pi) \le \alpha_\mu(n, 3D, \epsilon^d)^k.$$

Proof. Immediate from Lemma 17 and Lemma 18. □

Section ?? is devoted to proving Lemma 17 and Section ?? proves Lemma 18.




3.3 Putting the two parts together 

To complete the proof of our lower bound, we need the following result of [6].


Definition 20. Let $f : \{0,1\}^n \to \{0,1\}$ be any function. The sensitivity of f at input $x \in \{0,1\}^n$, denoted
$s_x(f)$, is the number of indices $i \in [n]$ such that f changes value upon flipping the i-th bit of x. The
sensitivity of f, denoted $s(f)$, is the maximum of $s_x(f)$ over all x.

Theorem 21 (Goyal, Kindler and Saks [6] (Theorem 32)). Let $\epsilon \in (0,1/2)$ and $\delta \in (0,1/16)$, and let f
be an n-variate boolean function. Any randomized gnd tree T that for every input x outputs f(x) with
probability $1 - \delta$ when run with noise parameter $\epsilon$ satisfies:


$$\mathrm{depth}(T) \ge \frac{\log(1/4\delta)}{50 \log^2(1/\epsilon)}\, \epsilon^2\, s(f).$$


We will restate the above theorem for the case of parity in terms of advantage of xnd trees. 

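Theorem 22 (Restatement of Theorem 21). Let $\mu$ be the distribution on $\{0,1\}^n$ defined by $\mu(0^n) = \frac{1}{2}$ and
$\mu(e) = \frac{1}{2n}$ for all $e \in \{0,1\}^n$ of weight 1. Then


$$\alpha_\mu(n, D, \epsilon) \le \max\left\{ \frac{7}{8},\; 1 - \frac{1}{2}\exp\left( -\frac{50\, D \log^2(1/\epsilon)}{\epsilon^2 n} \right) \right\}. \qquad (1)$$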


Proof of the restatement. Let $\mu$ be as given in the theorem. Theorem 21 is proved in [6] by proving an
upper bound on the probability that T is correct when T is executed on an input selected at random from
the distribution $\mu$. Thus any gnd tree T that makes an average error of at most $\delta < 1/16$ for computing the
parity function $\oplus : \{0,1\}^n \to \{0,1\}$ on inputs from the distribution $\mu$, when run with noise parameter $\epsilon$,
must have


$$\mathrm{depth}(T) \ge \frac{\log(1/4\delta)}{50 \log^2(1/\epsilon)}\, \epsilon^2 n,$$


since the sensitivity of the parity function $\oplus : \{0,1\}^n \to \{0,1\}$ is n. As the RHS of the above inequality is
strictly decreasing in $\delta$, we conclude that any $(n, 1, D, \epsilon)$-xnd tree T makes an average error of at least
$\delta'$ for computing the parity function on inputs from the distribution $\mu$, where


$$\delta' = \min\left\{ \frac{1}{16},\; \frac{1}{4}\exp\left( -\frac{50\, D \log^2(1/\epsilon)}{\epsilon^2 n} \right) \right\}.$$


Thus $\mathrm{adv}_{\oplus,\mu}(T) \le 1 - 2\delta'$, which proves the theorem.


□ 


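Proof of Theorem 3. Let $\mu$ be the distribution defined in Theorem 22. By combining Lemmas 16 and 19,
we conclude that with probability $1 - o(1)$ over the random variable $\mathcal{N}(N, R)$, the following is true: if there
is a $\delta$-error protocol on $\mathcal{N}(N, R)$ with $\epsilon$-noise for computing the parity function with T transmissions, then

$$1 - 2\delta \le \alpha_\mu(n, 3D, \epsilon^d)^k,$$

where $n = \Omega(NR^2)$, $k = \Omega(1/R^2)$, $d = O(T/N)$ and $D = O(TR^2)$.

Since $R \le N^{-\beta}$, $k = \Omega(1/R^2)$ and $\delta$ is a constant, $\alpha_\mu(n, 3D, \epsilon^d)$ must be inverse polynomially close
to 1. Let $k \ge C/R^2$ and $d \le C'T/N$ for some constants C, C'. From (1), we thus get

$$1 - 2\delta \le \left( 1 - \exp\left( -O\left( \frac{T R^2 \log^2(1/\epsilon^{C'T/N})}{N R^2\, \epsilon^{2C'T/N}} \right) \right) \right)^{C/R^2}.$$

Denoting T/N by S and simplifying, we have

$$1 - 2\delta \le \exp\left( -\exp\left( -O\left( \frac{S \log^2(1/\epsilon^{C'S})}{\epsilon^{2C'S}} \right) \right) \frac{C}{R^2} \right).$$

Taking logarithms and noting that $R \le N^{-\beta}$, we get

$$\exp\left( O\left( \frac{S \log^2(1/\epsilon^{C'S})}{\epsilon^{2C'S}} \right) \right) \ge \frac{C N^{2\beta}}{\ln \frac{1}{1 - 2\delta}}.$$

From this we get

$$\frac{S \log^2(1/\epsilon^{C'S})}{\epsilon^{2C'S}} \ge C'' \log N,$$

for some constant C''. This yields $S = \Omega(\log\log N)$ and hence $T = \Omega(N \log\log N)$.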


□ 
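The last step of the proof, from the inequality $S \log^2(1/\epsilon^{C'S}) / \epsilon^{2C'S} \ge C'' \log N$ to $S = \Omega(\log\log N)$, can be unpacked as follows. This is a sketch under the assumptions that logarithms are natural and that $\epsilon < 1$ is a constant; the shorthand $L = \ln(1/\epsilon)$ is introduced here for convenience and is not notation used elsewhere in the paper.

```latex
\begin{align*}
\frac{S \log^2(1/\epsilon^{C'S})}{\epsilon^{2C'S}}
  = (C'L)^2\, S^3\, e^{2C'LS} &\ge C'' \ln N \\
\Longrightarrow\quad 2C'LS + 3\ln S + 2\ln(C'L) &\ge \ln\ln N + \ln C'' \\
\Longrightarrow\quad (2C'L + 3)\, S &\ge \ln\ln N + \ln C'' - 2\ln(C'L)
  \qquad (\text{using } \ln S \le S) \\
\Longrightarrow\quad S &= \Omega(\log\log N).
\end{align*}
```

The left-hand side grows only exponentially in S, so forcing it above a quantity of order $\log N$ requires S to be at least of order $\log\log N$.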


4 Decomposition of random planar networks 


The random placement of nodes in the unit square typically arranges them uniformly. We will exploit this 
uniformity to obtain the required decomposition. 


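Lemma 23 (Chernoff bounds). Let X be the sum of N independent identically distributed indicator
random variables. Let $\mu = E[X]$. Then, $\Pr[X \le \frac{1}{2}\mu] \le \exp(-0.15\mu)$.

Proof. The lemma follows immediately from the following version of the Chernoff bound due to Hoeffding [8]:
if the random variable X has binomial distribution $B(N, p)$, then


$$\Pr[X \ge (p + \delta)N] \le \left( \frac{p}{p + \delta} \right)^{(p+\delta)N} \left( \frac{1 - p}{1 - p - \delta} \right)^{(1-p-\delta)N}. \qquad (2)$$


To derive the lemma, we consider the random variable $Y = N - X$, and apply (2) with $p = 1 - \frac{\mu}{N}$ and
$\delta = \frac{\mu}{2N}$, to obtain

$$\begin{aligned}
\Pr[X \le \tfrac{1}{2}\mu] &\le \Pr[Y \ge (p + \delta)N] \\
&\le \left( \frac{p}{p + \delta} \right)^{(p+\delta)N} \cdot 2^{\mu/2} \\
&\le \exp(-\delta N) \cdot 2^{\mu/2} \\
&\le \exp\left( -\tfrac{1}{2}(1 - \ln 2)\mu \right) \\
&\le \exp(-0.15\mu).
\end{aligned}$$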

□ 
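The bound of Lemma 23 can be checked numerically against the exact binomial tail. The following sketch is only a sanity check for particular parameters, not a substitute for the proof; the helper names `binomial_lower_tail` and `chernoff_bound` and the chosen values of N and p are illustrative assumptions.

```python
import math


def binomial_lower_tail(n, p, threshold):
    """Exact Pr[X <= threshold] for X distributed as Binomial(n, p)."""
    upper = min(n, math.floor(threshold))
    return sum(
        math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(upper + 1)
    )


def chernoff_bound(mu):
    """The upper bound exp(-0.15 * mu) from Lemma 23."""
    return math.exp(-0.15 * mu)


# Example: N = 200 indicators, each equal to 1 with probability 0.1, so mu = 20.
n_trials, prob = 200, 0.1
mu = n_trials * prob
exact_tail = binomial_lower_tail(n_trials, prob, mu / 2)
bound = chernoff_bound(mu)
```

The constant 0.15 comes from rounding $(1 - \ln 2)/2 \approx 0.153$ down, which is why the final inequality in the proof holds.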

Proof of Lemma 16. We tessellate the unit square into $M = \lfloor 1/R \rfloor^2$ cells, each a square of side $1/\lfloor 1/R \rfloor$. We
number the rows and columns of this tessellation using indices in $\{1, 2, \ldots, \lfloor 1/R \rfloor\}$, and refer to the cell in
the i-th row and j-th column by $c_{ij}$. The expected number of processors in any one cell is $\mu = N/M$. Since
$R \ge \sqrt{10 \ln N / N}$, we have $\mu \ge 10 \ln N$, and by Lemma 23 the probability that there are fewer than $\mu/2$
processors in any one cell is $o(1/N)$. So, with probability $1 - o(1)$, all cells have at least $\mu/2 = N/(2M)$
processors.

Now, let $S_1 = \{c_{ij} : i \equiv 1 \pmod 3 \text{ and } j \equiv 1 \pmod 3\}$. Then, $|S_1| \ge M/9$. For each $c \in S_1$,
let the neighborhood of c, denoted by $\Gamma(c)$, be the set of (at most nine) cells that are at distance less than R
from c. Note that distinct cells in $S_1$ have disjoint neighborhoods. If the total number of transmissions in the
original protocol is T, then the average number of transmissions made from $\Gamma(c)$ as c ranges over $S_1$ is at
most $9T/M$. By Markov's inequality, for at least half the cells $c \in S_1$ fewer than $18T/M$ transmissions are
made from $\Gamma(c)$. Let $S_2$ be the set of these cells; $|S_2| \ge M/18$. For each cell $c \in S_2$, we identify the set $I_c$
of $\lceil N/(4M) \rceil$ processors that make fewest transmissions. We are now ready to describe the decomposition
of the planar communication network.

The set of input processors will be $I = \bigcup_{c \in S_2} I_c$. We fix the input of all processors not in I at 0, and
treat them as auxiliary processors. The protocol continues to compute the parity of the inputs provided to
processors in I. For $c \in S_2$, let $A_c$ be the set of auxiliary processors in the cells in $\Gamma(c)$. Also let $A_0$
be the set of all those auxiliary processors that are not in $\Gamma(c)$ for any $c \in S_2$. We have thus obtained a
decomposition $(A_0, (I_c, A_c) : c \in S_2)$, such that

(a) the number of input classes in the decomposition is $k = |S_2| \ge M/18$;

(b) each input class has $n = \lceil \mu/4 \rceil$ processors;

(c) the total number of transmissions made by all processors in $I_c \cup A_c$ is at most $D = 18T/M$;

(d) the total number of transmissions made by any one processor in $I_c$ is at most $d = D/n \le 72T/N$.

Thus we have obtained an (n, k)-decomposition of the network $\mathcal{N}$ and the original protocol now reduces to
a $\delta$-error $\epsilon$-noise (n, k, d, D)-protocol with respect to this decomposition for computing the parity function
on nk bits, where $n \ge NR^2/4$, $k \ge \frac{1}{18}\lfloor 1/R \rfloor^2$, $d \le 72T/N$ and $D \le 18T/\lfloor 1/R \rfloor^2 = O(TR^2)$. □
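The geometric part of this proof, choosing every third cell so that the selected cells have disjoint neighborhoods, can be sketched in code. The sketch assumes that the grid has $\lfloor 1/R \rfloor$ cells per side, so that each cell has side at least R and the cells within distance R of a cell lie in the surrounding 3 x 3 block; the function names `select_cells` and `neighborhood` are hypothetical.

```python
import math


def select_cells(radius):
    """Return the grid size m and the cells (i, j) with i, j = 1 (mod 3).

    The unit square is cut into m * m cells with m = floor(1 / radius), so
    each cell has side 1 / m >= radius. Rows and columns are numbered from 1.
    """
    m = math.floor(1 / radius)
    selected = [(i, j) for i in range(1, m + 1, 3) for j in range(1, m + 1, 3)]
    return m, selected


def neighborhood(cell, m):
    """Cells in the 3 x 3 block around `cell`, clipped to the m x m grid."""
    i, j = cell
    return {
        (a, b)
        for a in (i - 1, i, i + 1)
        for b in (j - 1, j, j + 1)
        if 1 <= a <= m and 1 <= b <= m
    }


# Example: radius 1/8 gives an 8 x 8 grid and nine selected cells.
m, selected = select_cells(0.125)
blocks = [neighborhood(c, m) for c in selected]
```

Spacing the selected cells three apart is what keeps the 3 x 3 blocks disjoint, at the cost of keeping only about one cell in nine; the proof then halves this again with Markov's inequality.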

5 Translation from protocols to read-once decision trees 

In this section, we will first translate bounded protocols for decomposed networks into xnd trees. Then we 
will show how we can rearrange oblivious decision trees in some cases to make them ordered. These two 
steps will then enable us to prove Lemma 17.

5.1 From bounded protocols to xnd trees 

Lemma 24. For any $\epsilon$-noise (n, k, d, D)-protocol $\Pi$ and any distribution $\mu$ on $(\{0,1\}^n)^k$, there is an
$(n, k, 3D, \epsilon^d)$-xnd tree T such that $\mathrm{adv}_{\oplus,\mu}(T) \ge \mathrm{adv}_{\oplus,\mu}(\Pi)$.

Proof. We will carry out the translation from bounded protocols to xnd trees via two intermediate models 
of communication protocols. 

Definition 25 (Intermediate protocols). The following two kinds of protocols are obtained by imposing
restrictions on bounded protocols for decomposed networks of Definition 8.

Semi-noisy protocol: An $\epsilon$-noise (n, k, d, D)-semi-noisy protocol differs from an $\epsilon$-noise (n, k, d, D)-protocol
only in the following respects.

(a) When it is the turn of an input processor to send a message, it sends only its input bit, whose
independent $\epsilon$-noisy copies are then received by its neighbors.

(b) A transmission made by an auxiliary processor is not subjected to any noise.

Noisy copy protocol: An $\epsilon$-noise (n, k, D)-noisy-copy protocol is an $\epsilon$-noise (n, k, 1, D)-semi-noisy protocol;
in other words, every input processor makes exactly one broadcast of its input bit, so that each of
its neighbors receives exactly one independent $\epsilon$-noisy copy of this input bit.

Remark 26. In these special kinds of protocols, the messages sent by the input processors do not depend
on the messages these processors receive. Thus, we may assume that the input processors make their
transmissions at the beginning of the protocol an appropriate number of times, and after that the auxiliary
processors interact according to a zero-noise protocol.

Claim 27 (From bounded protocol to semi-noisy). For every function f : ({0,1}”)^ —>■ —1}, distribu¬ 

tion p, on ({0,1}"')^ and every e-noise {n, k, d, D)-protocol 11, there is an e-noise {n, k, d, 3D)-semi-noisy 
protocol Hi such that advj^^(n) < advj^^(ni). 

Claim 28 (From semi-noisy to noisy-copy). For every function $f : (\{0,1\}^n)^k \to \{+1,-1\}$, distribution
$\mu$ on $(\{0,1\}^n)^k$ and every $\epsilon$-noise $(n, k, d, D)$-semi-noisy protocol $\Pi_1$, there is an $\epsilon^d$-noise $(n, k, D)$-noisy-copy
protocol $\Pi_2$ such that $\mathrm{adv}_{f,\mu}(\Pi_1) \le \mathrm{adv}_{f,\mu}(\Pi_2)$.

Claim 29 (From noisy-copy to xnd tree). For every function $f : (\{0,1\}^n)^k \to \{+1,-1\}$, distribution $\mu$ on
$(\{0,1\}^n)^k$ and every $\epsilon$-noise $(n, k, D)$-noisy-copy protocol $\Pi_2$, there is an $(n, k, D, \epsilon)$-xnd tree $\mathcal{T}$ such that
$\mathrm{adv}_{f,\mu}(\Pi_2) \le \mathrm{adv}_{f,\mu}(\mathcal{T})$.

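Lemma 24 follows immediately from Claims 27, 28 and 29, applied in that order: Claim 27 yields an $\epsilon$-noise
$(n, k, d, 3D)$-semi-noisy protocol, Claim 28 turns it into an $\epsilon^d$-noise $(n, k, 3D)$-noisy-copy protocol, and
Claim 29 turns that into an $(n, k, 3D, \epsilon^d)$-xnd tree, with the advantage never decreasing. □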

Proof of Claim 27. Fix an $\epsilon$-noise $(n,k,d,D)$-protocol $\Pi$ on a graph $G$. We will construct an $\epsilon$-noise
$(n, k, d, 3D)$-semi-noisy protocol $\Pi_1$ on a graph $G_1 = (V_1, E_1)$. The graph $G_1$ will contain $G$ as a subgraph;
however, all vertices inherited from $G$ will correspond to auxiliary processors. In addition, for each
input vertex $v$ of $G$, we will have a new input vertex $v'$ in $G_1$, which will be connected to $v$ and its neighbors
in $G$. Let $(I = \bigcup_{j=1}^{k} I_j,\ A = A_0 \cup \bigcup_{j=1}^{k} A_j)$ be the decomposition corresponding to $\Pi$. The
decomposition corresponding to $\Pi_1$ will be $(I' = \bigcup_{j=1}^{k} I'_j,\ A' = A_0 \cup \bigcup_{j=1}^{k} A'_j)$, where
$I'_j = \{v' : v \in I_j\}$ and $A'_j = A_j \cup I_j$.

Suppose $\Pi$ uses $T$ transmissions. For $i = 1,2,\ldots,T$ and $v \in V(G)$, let $b_v[i]$ be the bit received by
$v$ when the $i$-th transmission is made; if $v$ does not receive the $i$-th transmission, we define $b_v[i]$ to be 0.
The protocol $\Pi_1$ for simulating $\Pi$ will operate in $T$ stages, one for each transmission made by $\Pi$. The
goal is to ensure that in the end each auxiliary processor $v$ of $G_1$ constructs a sequence $b'_v \in \{0,1\}^T$
such that $(b'_v : v \in V(G))$ and $(b_v : v \in V(G))$ (of the protocol $\Pi$) have the same distribution, for every
input in $(\{0,1\}^n)^k$. This implies that the outputs of $\Pi_1$ and $\Pi$ have the same distribution. Suppose the first
$\ell - 1$ stages have been successfully simulated and $(b'_v[1,\ldots,\ell-1] : v \in V(G))$ have been appropriately
constructed. We now describe how stage $\ell$ is implemented and $(b'_v[\ell] : v \in V(G))$ are constructed. If the
$\ell$-th transmission in $\Pi$ is made by an auxiliary processor $v$ in $G$, then it will be simulated in $\Pi_1$ using
one noiseless transmission from $v$; if the $\ell$-th transmission is made by an input vertex $v$ of $G$, then it
will be simulated in $\Pi_1$ using two (noiseless) transmissions from $v$ and one $\epsilon$-noisy transmission from the
corresponding (newly added) input vertex $v'$.

$v$ is an auxiliary vertex in $G$: The auxiliary vertex $v$ in $G_1$ operates exactly in the same fashion as
in $G$, and sends a bit $b$, which is received without error by all its neighbors. Each neighbor $w \in V(G)$ of $v$
independently sets its bit $b'_w[\ell]$ to be an $\epsilon$-noisy copy of $b$ (using its internal randomness).

$v$ is an input vertex in $G$: The auxiliary vertex $v$ in $G_1$ has all the information that the corresponding
input vertex $v$ in $G$ would have had, except the input (which is now given to the new input vertex $v'$). So,
$v$ transmits (with no noise) two bits, $b_0$ and $b_1$, corresponding to the two possible input values that $v'$ might
have. Next, the input vertex $v'$ transmits its input $c$; let $c_w$ denote the $\epsilon$-noisy version of $c$ that the neighbor
$w \in V(G)$ receives. Each neighbor $w$ of $v$ now acts as follows: if $b_0 = b_1$, then it sets $b'_w[\ell]$ to be an $\epsilon$-noisy
copy of $b_0$ (using its internal randomness); if $b_0 \ne b_1$, then it sets $b'_w[\ell]$ to $b_{c_w}$. □
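
The case analysis for an input vertex can be checked mechanically. The following is a minimal Python sketch and not part of the original proof: the function names (`neighbor_rule`, `prob_one_simulated`, `prob_one_original`, `distributions_agree`) are hypothetical, and the noise rate is passed as an exact fraction so that the comparison is exact. It suggests that, for every choice of $b_0$, $b_1$ and input $c$, the bit a neighbor records in $\Pi_1$ has the same distribution as an $\epsilon$-noisy copy of $b_c$, which is what that neighbor would have received in $\Pi$.

```python
from fractions import Fraction
from itertools import product


def neighbor_rule(b0, b1, c_w, eps, rng):
    """Bit recorded by a neighbor w, given the two noiseless bits b0, b1
    and its own eps-noisy copy c_w of the input c (rule from the proof)."""
    if b0 == b1:
        # The input is irrelevant: add the noise locally.
        return b0 ^ int(rng.random() < eps)
    # The input matters: the channel noise on c_w already supplies the noise.
    return b1 if c_w else b0


def prob_one_simulated(b0, b1, c, eps):
    """Exact Pr[recorded bit = 1] in the simulation when the true input is c."""
    if b0 == b1:
        return eps if b0 == 0 else 1 - eps
    # c_w equals c with probability 1 - eps and 1 - c with probability eps.
    total = Fraction(0)
    for c_w, p in ((c, 1 - eps), (1 - c, eps)):
        total += p * (b1 if c_w else b0)
    return total


def prob_one_original(b0, b1, c, eps):
    """Exact Pr[received bit = 1] when the input vertex sends b_c over an eps-noisy channel."""
    sent = b1 if c else b0
    return eps if sent == 0 else 1 - eps


def distributions_agree(eps):
    """Compare the two distributions over all (b0, b1, c)."""
    return all(
        prob_one_simulated(b0, b1, c, eps) == prob_one_original(b0, b1, c, eps)
        for b0, b1, c in product((0, 1), repeat=3)
    )
```

For instance, `distributions_agree(Fraction(1, 10))` compares the two distributions for all eight combinations of $(b_0, b_1, c)$.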

Proof of Claim 28. Let $\Pi_1$ be an $\epsilon$-noise $(n,k,d,D)$-semi-noisy protocol. As remarked above, all input
processors in a semi-noisy protocol can be assumed to make their transmissions right at the beginning, after
which only the auxiliary processors operate. Thus, each auxiliary processor receives at most $d$ independent
$\epsilon$-noisy copies of the input from each input processor in its neighborhood. The following lemma of Goyal,
Kindler and Saks shows that a processor can generate $d$ independent $\epsilon$-noisy copies of any input from
one $\epsilon^d$-noisy copy.

Lemma 30 (Goyal, Kindler and Saks (Lemma 36)). Let $t$ be an arbitrary integer, $\epsilon \in (0,1/2)$ and
$\gamma = \epsilon^t$. There is a randomized algorithm that takes as input a single bit $b$ and outputs a sequence of $t$ bits
and has the property that if the input is a $\gamma$-noisy copy of 0 (respectively of 1), then the output is a sequence
of independent $\epsilon$-noisy copies of 0 (respectively of 1).
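
To make the statement concrete, here is a hedged Python sketch of one way such an algorithm could be realized for small $t$; it is not claimed to be the construction of Goyal, Kindler and Saks. The idea is to solve for two output distributions $Q_0, Q_1$ on $\{0,1\}^t$ (used when the received bit is 0 or 1, respectively) such that mixing them with weights $1-\gamma$ and $\gamma$ reproduces the law of $t$ independent $\epsilon$-noisy copies. All names (`product_noise`, `expansion_kernel`, `expand`) are hypothetical, and exact fractions are assumed for $\epsilon$ so the identities can be checked exactly.

```python
import random
from fractions import Fraction
from itertools import product


def product_noise(bit, eps, t):
    """Law of t independent eps-noisy copies of `bit`, as a dict over {0,1}^t."""
    dist = {}
    for y in product((0, 1), repeat=t):
        flips = sum(1 for yi in y if yi != bit)
        dist[y] = eps ** flips * (1 - eps) ** (t - flips)
    return dist


def expansion_kernel(eps, t):
    """Return (gamma, Q0, Q1) with (1-gamma)*Q0 + gamma*Q1 = P0 and
    gamma*Q0 + (1-gamma)*Q1 = P1, where P_b = product_noise(b, eps, t)."""
    gamma = eps ** t
    p0, p1 = product_noise(0, eps, t), product_noise(1, eps, t)
    denom = 1 - 2 * gamma  # positive because gamma < 1/2
    q0 = {y: ((1 - gamma) * p0[y] - gamma * p1[y]) / denom for y in p0}
    q1 = {y: ((1 - gamma) * p1[y] - gamma * p0[y]) / denom for y in p0}
    return gamma, q0, q1


def expand(received_bit, eps, t, rng):
    """Sample t output bits from Q_{received_bit}."""
    _, q0, q1 = expansion_kernel(eps, t)
    q = q1 if received_bit else q0
    outcomes = list(q)
    return rng.choices(outcomes, weights=[float(q[y]) for y in outcomes])[0]
```

The entries of $Q_0$ and $Q_1$ are non-negative when $\gamma = \epsilon^t$ because $(1-\epsilon)^t + \epsilon^t \le 1$; the sketch enumerates all $2^t$ strings, so it is only meant for small $t$.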

We modify the protocol $\Pi_1$ to an $\epsilon^d$-noise $(n, k, D)$-noisy-copy protocol $\Pi_2$ by requiring that each input
processor makes one $\epsilon^d$-noisy transmission of its input bit. Each auxiliary processor, on receiving such a
transmission, uses its internal randomness to extract the required $\epsilon$-noisy copies. From then on the protocol
proceeds as before. We may now fix the internal randomness used by the auxiliary processors in such a way that
the advantage of the resulting protocol for the input distribution $\mu$ is at least as good as that of the original
protocol. Thus, all processors use (deterministic) functions to compute the bit that they transmit. □

Proof of Claim 29. Let $\Pi_2$ be an $\epsilon$-noise $(n, k, D)$-noisy-copy protocol, with the underlying decomposition
$(I = \bigcup_{j=1}^{k} I_j,\ A = A_0 \cup \bigcup_{j=1}^{k} A_j)$. We will now show how this protocol can be simulated using an
$(n, k, D, \epsilon)$-xnd tree $\mathcal{T}$. To keep our notation simple, we will assume (by introducing new edges, if necessary)
that (a) all processors in $A$ are adjacent, and (b) every processor in $A_j$ is adjacent to every processor
in $I_j$.

Let $T$ be the total number of transmissions in $\Pi_2$. Let $b_1, b_2, \ldots, b_T$ be the sequence of bits
transmitted in $\Pi_2$ by the auxiliary processors. Suppose $b_i$ is transmitted by vertex $v \in A_j$ by computing
$g_i(b_1 b_2 \cdots b_{i-1}, x_j \oplus z_v)$, where $x_j$ is the restriction of the input assignment to $I_j$ and $z_v$ is an $\epsilon$-noisy
vector in $\{0,1\}^n$.

The nodes of the xnd tree $\mathcal{T}$ are 0-1 sequences of length at most $T$ (the root is the node at the 0th level and
corresponds to the empty sequence). The children of the node $b \in \{0,1\}^{i-1}$ ($0 \le i-1 \le T-1$) are the
two vertices $b0$ and $b1$. Suppose vertex $v \in A_j$ makes the $i$-th transmission. The function that $v$ computes to
determine what to transmit will be used to compute the successor of the nodes at the $(i-1)$-th level. To state
this formally, the label of $b \in \{0,1\}^{i-1}$ (at level $i-1$ in $\mathcal{T}$) is $(j, h)$, where $h(x_j, z_v) = b \cdot g_i(b, x_j \oplus z_v)$.
(Since our definition requires the function to return a child of $b$, $h$ returns an extension of $b$ in $\{0,1\}^i$.)
The set of leaves of $\mathcal{T}$, $L(\mathcal{T})$, is precisely $\{0,1\}^T$. Let $a : L(\mathcal{T}) \to \{+1,-1\}$ be defined by
$a(b_1 b_2 \cdots b_T) = (-1)^{b_T}$. Then, it follows from our definitions that

$$\mathrm{adv}_{\oplus,\mu}(\mathcal{T}) \ge |\mathbb{E}[\oplus(x)\, a(\mathcal{T}(x))]| = |\mathbb{E}[\oplus(x)\, (-1)^{b_T}]| = \mathrm{adv}_{\oplus,\mu}(\Pi_2).$$

□
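
The construction in this proof can be sketched as a root-to-leaf walk. The Python fragment below is only an illustration under assumed data structures: `transmit_fns`, `owners`, `x_blocks` and `noise` are hypothetical names, tree nodes are represented as bit strings, and each $g_i$ is taken to be a Python function of the transcript so far and the noisy view $x_j \oplus z_v$.

```python
def walk_xnd_tree(transmit_fns, owners, x_blocks, noise):
    """Follow the path of the xnd tree determined by the input and the noise.

    transmit_fns[i](history, noisy_block) -> 0 or 1 plays the role of g_{i+1};
    owners[i] = (j, v) says that vertex v in A_j makes transmission i+1;
    x_blocks[j] is the input restricted to I_j, noise[v] is the vector z_v.
    The node reached after i steps is the bit string b_1 ... b_i.
    """
    history = ""  # the root is the empty sequence
    for g_i, (j, v) in zip(transmit_fns, owners):
        noisy_block = tuple(xb ^ zb for xb, zb in zip(x_blocks[j], noise[v]))
        history += str(g_i(history, noisy_block))  # move to the child b0 or b1
    return history


def leaf_sign(leaf):
    """The leaf labelling a(b_1 ... b_T) = (-1)^{b_T}."""
    return -1 if leaf[-1] == "1" else 1


# Tiny example with two blocks of two bits and three transmissions.
block_parity = lambda history, block: sum(block) % 2
xor_of_history = lambda history, block: history.count("1") % 2
example_leaf = walk_xnd_tree(
    [block_parity, block_parity, xor_of_history],
    [(1, "u"), (2, "w"), (1, "u")],
    {1: (1, 0), 2: (1, 1)},
    {"u": (0, 0), "w": (0, 1)},
)
```

In this toy run the walk visits the nodes "1", "11" and "110", so the leaf label is $+1$.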


5.2 Tree rearrangement 

Our main observation in this section is that oblivious decision trees can be assumed to be ordered when the
inputs come from a product distribution, and we wish to approximate the parity function. To show this we
will describe a method for rearranging an arbitrary oblivious decision tree so that it becomes ordered.

Definition 31 (Tree rearrangement). Let $\mathcal{T}$ and $\mathcal{T}'$ be oblivious decision trees for the same set of inputs.
We say that $\mathcal{T}'$ is a rearrangement of tree $\mathcal{T}$ if

• both trees query each variable the same number of times;

• the functions labelling vertices of $\mathcal{T}'$ also appear in $\mathcal{T}$ (up to obvious renaming of children); formally,
for every vertex $v'$ in $\mathcal{T}'$ labelled $(i, g')$, there is a vertex $v$ in $\mathcal{T}$ labelled $(i, g)$ and a bijection
$\pi : C_v \to C_{v'}$ such that $\forall x \in S : g'(x) = \pi(g(x))$.

Lemma 32 (Ordering lemma). Let $\mu$ be a product distribution on some set $S^k$. Let $f : S^k \to \{+1,-1\}$
be of the form $f(x_1, x_2, \ldots, x_k) = h(x_1) h(x_2) \cdots h(x_k)$, where $h : S \to \{+1,-1\}$. Then every oblivious
decision tree $\mathcal{T}$ can be rearranged to obtain an ordered oblivious decision tree $\hat{\mathcal{T}}$ such that
$\mathrm{adv}_{f,\mu}(\hat{\mathcal{T}}) \ge \mathrm{adv}_{f,\mu}(\mathcal{T})$.

This lemma will follow immediately from the following claim.

Claim 33 (Move to root). Let $\mu$ be a product distribution on $S^k$. Let $f : S^k \to \{+1,-1\}$ be of the form
$f(x_1, x_2, \ldots, x_k) = h(x_1) h(x_2) \cdots h(x_k)$, where $h : S \to \{+1,-1\}$. Let $\mathcal{T}$ be an oblivious decision tree
with inputs in $S^k$ such that the input $x_k$ is queried only at the level just above the leaves. Then, $\mathcal{T}$ can be
rearranged to obtain a tree $\hat{\mathcal{T}}$ where

1. the input $x_k$ is queried only at the root;

2. for all $j \ne k$, if $x_j$ was queried at level $r$ of $\mathcal{T}$, then $x_j$ is queried at level $r + 1$ of $\hat{\mathcal{T}}$;

3. $\mathrm{adv}_{f,\mu}(\hat{\mathcal{T}}) \ge \mathrm{adv}_{f,\mu}(\mathcal{T})$.

Proof. Let $X = (X_1, X_2, \ldots, X_k)$ take values in $S^k$ with distribution $\mu$; since $\mu$ is a product distribution,
the $X_i$'s are independent. Suppose $\mathcal{T}$ makes $t$ queries to the input. Let $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{t+1}$ be the random
sequence of vertices visited by the computation of $\mathcal{T}$ on input $X$. Fix $b : L(\mathcal{T}) \to [-1,+1]$ such that

$$\mathrm{adv}_{f,\mu}(\mathcal{T}) = |\mathbb{E}[h(X_1) h(X_2) \cdots h(X_k)\, b(\mathbf{v}_{t+1})]| = |\mathbb{E}[\,\mathbb{E}[h(X_1) \cdots h(X_k)\, b(g_{\mathbf{v}_t}(X_k)) \mid \mathbf{v}_t]\,]|.$$

Since $X_k$ is queried only at the end, $h(X_1) \cdots h(X_{k-1})$ and $b(g_{\mathbf{v}_t}(X_k))$ are independent given $\mathbf{v}_t$, so

$$\mathbb{E}[h(X_1) \cdots h(X_{k-1}) h(X_k)\, b(g_{\mathbf{v}_t}(X_k)) \mid \mathbf{v}_t] = \mathbb{E}[h(X_1) \cdots h(X_{k-1}) \mid \mathbf{v}_t] \cdot \mathbb{E}[h(X_k)\, b(g_{\mathbf{v}_t}(X_k)) \mid \mathbf{v}_t].$$

Let $\alpha(v) = \mathbb{E}[h(X_1) \cdots h(X_{k-1}) \mid \mathbf{v}_t = v]$ and $\beta(v) = \mathbb{E}[h(X_k)\, b(g_v(X_k)) \mid \mathbf{v}_t = v]$. Let
$v^* = \arg\max_v |\beta(v)|$; thus, among the functions labelling vertices that query $X_k$ (at level $t$), $g_{v^*}$ has the best
advantage in the tree for $h$ under the distribution of $X_k$. It is thus natural to expect (and not hard to verify)
that if we replace all queries to $X_k$ by this query $g_{v^*}$, the overall advantage can only improve. Once this is
done, the last query does not depend on the previous queries, and can, therefore, be moved to the root. We
now present the argument formally. We have

$$\mathrm{adv}_{f,\mu}(\mathcal{T}) = |\mathbb{E}[\alpha(\mathbf{v}_t)\, \beta(\mathbf{v}_t)]| \le \mathbb{E}[|\alpha(\mathbf{v}_t)|] \cdot |\beta(v^*)|. \qquad (3)$$

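We are now ready to describe the rearrangement of $\mathcal{T}$. Let $\mathcal{T}^-$ be the subtree of $\mathcal{T}$ consisting of the
first $t$ levels of vertices; thus vertices where $X_k$ is queried in $\mathcal{T}$ become leaves in $\mathcal{T}^-$. We first make $|C_{v^*}|$
copies of $\mathcal{T}^-$; we refer to these copies by $\mathcal{T}^-_c$ ($c \in C_{v^*}$), and assume that the root of $\mathcal{T}^-_c$ is renamed $c$.
In the new tree $\hat{\mathcal{T}}$, we have a root with label $(k, g_{v^*})$ which is connected to the subtrees $\mathcal{T}^-_c$. We claim
that $\mathrm{adv}_{f,\mu}(\hat{\mathcal{T}}) \ge \mathrm{adv}_{f,\mu}(\mathcal{T})$. Indeed, consider the function $\hat{b} : L(\hat{\mathcal{T}}) \to [-1,+1]$ that takes the value
$\mathrm{sign}(\alpha(v))\, b(c)$ on the leaf in $\mathcal{T}^-_c$ corresponding to $v \in L(\mathcal{T}^-)$. Then, writing $\hat{\mathbf{v}}_{t+1}$ for the leaf
reached in $\hat{\mathcal{T}}$, we have

$$\mathrm{adv}_{f,\mu}(\hat{\mathcal{T}}) \ge |\mathbb{E}[h(X_1) h(X_2) \cdots h(X_k)\, \hat{b}(\hat{\mathbf{v}}_{t+1})]| = \mathbb{E}[|\alpha(\mathbf{v}_t)|] \cdot |\beta(v^*)|. \qquad (4)$$

Claim 33 now follows by combining (3) and (4). □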

We are now ready to show how trees computing the parity function can be reordered, and prove Lemma 32.
The argument essentially involves repeated application of Claim 33 to place all queries made to a variable in
adjacent levels. We state the argument formally by considering a carefully defined minimal counterexample.

Proof of Lemma 32. Fix an oblivious decision tree $\mathcal{T}$. Let the depth of $\mathcal{T}$ be $r$. We say that there is an
alternation at level $\ell \in \{3, \ldots, r\}$ of $\mathcal{T}$ if the variable queried at level $\ell$ is queried at a level before $\ell - 1$ but not
at level $\ell - 1$. Clearly, a tree with no alternations is an ordered tree. Among all rearrangements of $\mathcal{T}$, let $\hat{\mathcal{T}}$
be such that

(P1) $\mathrm{adv}_{f,\mu}(\hat{\mathcal{T}}) \ge \mathrm{adv}_{f,\mu}(\mathcal{T})$;

(P2) among all rearrangements satisfying (P1), $\hat{\mathcal{T}}$ has the fewest alternations;

(P3) among all rearrangements satisfying (P1) and (P2), the last alternation in $\hat{\mathcal{T}}$ is farthest from the root.

We claim that $\hat{\mathcal{T}}$ has no alternations. Let us assume that $\hat{\mathcal{T}}$ has alternations and arrive at a contradiction. Let
$\mathcal{T}'$ be the tree obtained from $\hat{\mathcal{T}}$ by merging queries on adjacent levels into one superquery. That is, if there
are $j$ adjacent levels somewhere in the tree that query $x_i$, with two outcomes each, then we replace these $j$ levels
by a single superquery with $2^j$ outcomes. Note that the number of alternations in $\mathcal{T}'$ is the same as in $\hat{\mathcal{T}}$.
Let $r'$ be the number of queries in $\mathcal{T}'$. We consider two cases:

$\mathcal{T}'$ does not have an alternation at level $r'$: Let $x_i$ be the variable queried at level $r'$. By Claim 33,
we obtain a tree $\mathcal{T}''$ where the superquery to $x_i$ appears only at the root, and all other superqueries are
shifted one level down. Now, however, if each superquery in $\mathcal{T}''$ is replaced by its corresponding subtree of
queries from $\hat{\mathcal{T}}$, then we obtain a rearrangement of $\mathcal{T}$ satisfying (P1) and (P2), but with the last alternation at a level
farther from the root, contradicting (P3).

$\mathcal{T}'$ has an alternation at level $r'$: Suppose $x_i$ is queried at level $r'$, and the previous query to $x_i$ is
at level $r'' < r'$ (with no queries to $x_i$ in the levels $r'' + 1, r'' + 2, \ldots, r' - 1$). Now, we apply Claim 33 to
the subtrees of $\mathcal{T}'$ rooted at level $r'' + 1$, thereby obtaining a rearrangement $\mathcal{T}''$, where $x_i$ is now queried at
level $r'' + 1$ instead of at level $r'$. Clearly, the resulting tree $\mathcal{T}''$ has fewer alternations than $\mathcal{T}'$. Furthermore,
if each superquery in $\mathcal{T}''$ is replaced by its corresponding tree of queries from $\hat{\mathcal{T}}$, we obtain a rearrangement
of $\mathcal{T}$. It can be verified that this rearrangement has advantage no worse than that of $\mathcal{T}$ but has fewer
alternations, contradicting (P2). □
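
The notion of alternation used in this proof depends only on the order in which an oblivious tree queries its variables. As a hedged illustration (the function name `alternation_levels` and the list encoding of the query order are assumptions made here, not notation from the paper), the following Python sketch lists the levels at which alternations occur.

```python
def alternation_levels(order):
    """Return the levels with an alternation, for an oblivious tree whose
    level l (1-indexed) queries the variable order[l - 1].

    Level l >= 3 has an alternation if its variable was queried at some
    level before l - 1 but not at level l - 1.
    """
    levels = []
    for level in range(3, len(order) + 1):
        var = order[level - 1]
        previous = order[level - 2]   # variable queried at level l - 1
        earlier = order[:level - 2]   # variables queried before level l - 1
        if previous != var and var in earlier:
            levels.append(level)
    return levels


def is_ordered(order):
    """A tree with no alternations is ordered."""
    return not alternation_levels(order)
```

For example, the query order `[1, 2, 2, 1, 2]` has alternations at levels 4 and 5, whereas `[1, 1, 2, 2]` has none and is therefore ordered.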

5.3 Obtaining the read-once decision tree 

Proof of Lemma 17. By combining Lemmas 24 and 32 we see that $\Pi$ can be converted into an ordered
$(n, k, 3D, \epsilon^d)$-xnd tree. Since this tree is ordered, all queries to any particular variable appear in consecutive
levels. In our final tree we will combine all these queries into a single query. In particular, if there are
$\ell \le 3D$ levels that query $(x_i, z_i)$, then we collapse them, so as to yield a single query with $2^\ell$ outcomes.
Note, however, that the result of this query depends not only on the real input $x_i \in \{0,1\}^n$ but also
on the noise variable $z_i$. In the final noisy decision tree $\mathcal{T}$, we regard this superquery $g(x_i)$ as a noisy
function of the input $x_i$, with $z_i$ providing the internal randomness for its computation. Since $g(x_i)$ was
derived from an $(n, 1, \ell, \epsilon^d)$-xnd tree with $\ell \le 3D$, it follows from the definition of $\alpha_\mu(n, 3D, \epsilon^d)$ that
$\mathrm{adv}_{\oplus,\mu}(g) \le \alpha_\mu(n, 3D, \epsilon^d)$. □


6 Analysis of read-once decision trees 

In this section, we will prove Lemma 18. We will make use of the following proposition.

Proposition 34. Let $X$ be a random variable taking values in $\{0,1\}^n$ with distribution $\mu$. Then, for all
$f : \{0,1\}^n \to \{+1,-1\}$, $A : \{0,1\}^n \to C$ and $a : C \to \mathbb{R}$,

$$|\mathbb{E}[f(X)\, a(A(X))]| \le |a| \cdot \mathrm{adv}_{f,\mu}(A),$$

where $|a| = \max_{c \in C} |a(c)|$.
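Proof.

$$
\begin{aligned}
|\mathbb{E}[f(X)\, a(A(X))]|
&= \Big| \sum_{c \in C} \mathbb{E}[f(X)\, a(A(X)) \mid A(X) = c] \cdot \Pr[A(X) = c] \Big| \\
&\le \sum_{c \in C} |a(c)| \cdot |\mathbb{E}[f(X) \mid A(X) = c]| \cdot \Pr[A(X) = c] \\
&\le \max_{c \in C} |a(c)| \cdot \sum_{c \in C} |\mathbb{E}[f(X) \mid A(X) = c]| \cdot \Pr[A(X) = c] \\
&= |a| \cdot \sum_{c \in C} \mathbb{E}[f(X)\, b(A(X)) \mid A(X) = c] \cdot \Pr[A(X) = c] \\
&\le |a| \cdot |\mathbb{E}[f(X)\, b(A(X))]| \\
&\le |a| \cdot \mathrm{adv}_{f,\mu}(A),
\end{aligned}
$$

where $b : C \to \{+1,-1\}$ is defined as $b(c) = \mathrm{sign}(\mathbb{E}[f(X) \mid A(X) = c])$ for all $c \in C$.

□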
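
Proposition 34 can be sanity-checked by brute force on a small example. The Python sketch below is illustrative only: it assumes that $\mathrm{adv}_{f,\mu}(A)$ is the maximum of $|\mathbb{E}[f(X)\, b(A(X))]|$ over $b : C \to [-1,+1]$ (which equals $\sum_{c} |\mathbb{E}[f(X)\, \mathbf{1}\{A(X) = c\}]|$), it takes $A$ to be deterministic, and all names and the example functions are hypothetical.

```python
from fractions import Fraction
from itertools import product


def advantage(f, A, mu):
    """max over b: C -> [-1, 1] of |E[f(X) b(A(X))]|, computed as
    sum over outcomes c of |E[f(X) * 1{A(X) = c}]| (assumed definition)."""
    by_outcome = {}
    for x, p in mu.items():
        by_outcome[A(x)] = by_outcome.get(A(x), 0) + p * f(x)
    return sum(abs(s) for s in by_outcome.values())


def weighted_correlation(f, A, a, mu):
    """|E[f(X) a(A(X))]| for a real-valued labelling a of the outcomes."""
    return abs(sum(p * f(x) * a[A(x)] for x, p in mu.items()))


# Example: uniform distribution on {0,1}^3, f = parity, A = capped Hamming weight.
mu = {x: Fraction(1, 8) for x in product((0, 1), repeat=3)}
f = lambda x: -1 if sum(x) % 2 else 1
A = lambda x: min(sum(x), 2)
a = {0: Fraction(1), 1: Fraction(1, 2), 2: Fraction(-2)}

lhs = weighted_correlation(f, A, a, mu)
rhs = max(abs(v) for v in a.values()) * advantage(f, A, mu)
```

In this example `lhs` is $9/16$ and `rhs` is $2 \cdot 3/4 = 3/2$, consistent with the inequality.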




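Proof of Lemma 18. Fix $b : L(\mathcal{T}) \to [-1,+1]$. Let $X$ take values in $(\{0,1\}^n)^k$ with distribution $\mu^k$.
We wish to show that

$$|\mathbb{E}[\oplus(X)\, b(\mathcal{T}(X))]| \le \alpha^k.$$

Here $\oplus(X) = h(X_1) h(X_2) \cdots h(X_k)$, where $h$ is the ($\pm 1$-valued) parity of a single block.
Let the (random) sequence of vertices visited by the computation of $\mathcal{T}$ on input $X$ be $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k, \mathbf{v}_{k+1}$.
For $i = 1, 2, \ldots, k$ and $v$ in level $i$ of the tree (at distance $i - 1$ from the root) let

$$\alpha_i(v) = \mathbb{E}[h(X_i) h(X_{i+1}) \cdots h(X_k)\, b(\mathbf{v}_{k+1}) \mid \mathbf{v}_i = v].$$

We will show by reverse induction on $i$ that $|\alpha_i(v)| \le \alpha^{k-i+1}$. The claim will then follow by taking $i$ to
be 1 and $v$ to be the root of $\mathcal{T}$. For the base case, we have

$$|\alpha_k(v)| = |\mathbb{E}[h(X_k)\, b(\mathbf{v}_{k+1}) \mid \mathbf{v}_k = v]| = |\mathbb{E}[h(X_k)\, b(g_v(X_k))]| \le \mathrm{adv}_{h,\mu}(g_v) \le \alpha.$$

For the induction step assume that $i < k$ and that $|\alpha_{i+1}(w)| \le \alpha^{k-i}$ for all vertices $w$ in level $i + 1$ of
the tree (at distance $i$ from the root). Then, for a vertex $v$ in level $i$, we have

$$
\begin{aligned}
|\alpha_i(v)| &= |\mathbb{E}[h(X_i) h(X_{i+1}) \cdots h(X_k)\, b(\mathbf{v}_{k+1}) \mid \mathbf{v}_i = v]| \\
&= |\mathbb{E}[h(X_i)\, \alpha_{i+1}(g_v(X_i))]| \\
&\le \mathrm{adv}_{h,\mu}(g_v) \cdot \max_w |\alpha_{i+1}(w)| \\
&\le \alpha^{k-i+1},
\end{aligned}
$$

where we used Proposition 34 to justify the second-last inequality, and the induction hypothesis to justify
the last inequality. □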

7 Conclusions 

In this paper, we presented the first lower bound result for the realistic model of wireless communication
networks where there is a restriction on transmission power. Any bit sent by a transmitter is received (with
channel noise) only by receivers which are within the transmission radius of the transmitter. We showed
that to compute the parity of $N$ input bits with constant probability of error, we need $\Omega(N \log\log N)$
transmissions. This result nicely complements the upper bound result of Ying, Srikant and Dullerud,
which showed that $O(N \log\log N)$ transmissions are sufficient for computing the sum of all the $N$ bits.
Our result also implies that the sum of $N$ bits cannot be approximated up to a constant additive error by any
constant error protocol for $\mathcal{N}(N, R)$ using $o(N \log\log N)$ transmissions, if $R \le [\ldots]$ for some $\beta > 0$.

Although the techniques of network decomposition and translation of bounded protocols to xnd trees
are fairly general, some crucial parts of our proof are not. In particular, the rearrangement of xnd trees to get
ordered xnd trees and the analysis of read-once decision trees used the fact that we are trying to compute the
parity function. Thus the same proof does not yield similar lower bounds for other functions like majority.
In subsequent work, we have eliminated the need for these parts of the proof using entirely different arguments.
We have thus succeeded in showing a lower bound of $\Omega(N \log\log N)$ transmissions for computing
the majority and other functions. These results also show that one cannot approximate the sum of $N$ bits to
within an additive error of $N^{\alpha}$ (for some $\alpha > 0$) using $o(N \log\log N)$ transmissions.